Via the tensor product structure of the nonlinear space we are able to solve the general estimation problem of nonlinear functionals of Gaussian processes, in the sense that we can reduce the nonlinear problem to a standard linear estimation problem, the theory of which has been well developed. We also introduce the concept of a super predictor for a class of prediction problems and derive a lower bound for the mean square error of the nonlinear prediction.

1. BACKGROUND

Let $X = (X_t, t \in T)$ be a zero mean Gaussian process defined on a probability space $(\Omega, \mathcal{B}, P)$. $\mathcal{B}$ is usually taken to be $\mathcal{B}(X)$, the $\sigma$-field generated by the process $X$.

There are two important Hilbert spaces associated with the (Gaussian) process $X$. The nonlinear space of $X$, $L_2(X) = L_2(\Omega, \mathcal{B}(X), P)$, consists of all $\mathcal{B}(X)$-measurable random variables with finite second moment, which are called (nonlinear) $L_2$-functionals of $X$. The linear space of $X$, $H(X)$, is the closed subspace of $L_2(X)$ spanned by $X_t$, $t \in T$, and its elements are called linear $L_2$-functionals of $X$. If $S$ is a subset of $T$, then the nonlinear space and linear space of the Gaussian process $(X_t, t \in S)$ are denoted by $L_2(X;S)$ and $H(X;S)$ respectively. Note that $L_2(X;S)$ is a closed subspace of $L_2(X)$ and $H(X;S)$ a closed subspace of $H(X)$.

Suppose $\xi \in H(X)$ and $E\xi^2 = t$. Then $\xi$ is a Gaussian variable with mean zero and variance $t$. Applying the Gram-Schmidt procedure to orthogonalize the sequence of random variables $1, \xi, \xi^2, \ldots$ in $L_2(X)$, we obtain the orthogonal sequence $H_{0,t}(\xi), H_{1,t}(\xi), H_{2,t}(\xi), \ldots$. $H_{p,t}(\xi)$ is called the Hermite polynomial of degree $p$ with parameter $t$, and is a polynomial in both variables $t$ and $\xi$. The first few Hermite polynomials are

$H_{0,t}(\xi) = 1$,  $H_{1,t}(\xi) = \xi$,  $H_{2,t}(\xi) = \xi^2 - t$,  $H_{3,t}(\xi) = \xi^3 - 3t\xi$.

The Hermite polynomials satisfy the following properties:

(1)  $E\,H_{p,t}(\xi)\,H_{q,t}(\xi) = p!\,\delta_{pq}\,t^p$,

(2)  $\exp(u\xi - \tfrac{1}{2}u^2 t) = \sum_{p \ge 0} \frac{u^p}{p!}\, H_{p,t}(\xi)$,

(3)  $H_{p,a^2 t}(a\xi) = a^p H_{p,t}(\xi)$, $a > 0$.

When $t = 1$, $H_{p,t}(\xi)$ will be written as $H_p(\xi)$.
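As an illustration, the Hermite polynomials with parameter $t$ can be evaluated numerically and property (1) checked by quadrature. The sketch below is not part of the original development: it assumes the standard three-term recurrence $H_{p+1,t}(x) = x H_{p,t}(x) - p\,t\,H_{p-1,t}(x)$, which reproduces the polynomials listed above, and the function names `hermite` and `gaussian_mean` are illustrative.

```python
import math

import numpy as np
from numpy.polynomial.hermite_e import hermegauss


def hermite(p, t, x):
    """Evaluate H_{p,t}(x) by the recurrence
    H_{p+1,t}(x) = x * H_{p,t}(x) - p * t * H_{p-1,t}(x)  (assumed, standard)."""
    x = np.asarray(x, dtype=float)
    h_prev, h = np.zeros_like(x), np.ones_like(x)  # H_{-1} := 0, H_0 = 1
    for k in range(p):
        h_prev, h = h, x * h - k * t * h_prev
    return h


def gaussian_mean(f, t, order=60):
    """Approximate E f(xi) for xi ~ N(0, t) by Gauss-Hermite quadrature."""
    nodes, weights = hermegauss(order)  # weight function exp(-x^2 / 2)
    return float(np.sum(weights * f(math.sqrt(t) * nodes)) / math.sqrt(2.0 * math.pi))


def gram_matrix(max_degree, t):
    """Matrix of E H_{p,t}(xi) H_{q,t}(xi); by (1) it should be diag(p! t^p)."""
    size = max_degree + 1
    gram = np.zeros((size, size))
    for p in range(size):
        for q in range(size):
            gram[p, q] = gaussian_mean(
                lambda x: hermite(p, t, x) * hermite(q, t, x), t
            )
    return gram
```

Calling `gram_matrix(4, 2.0)` gives a matrix that is diagonal up to rounding error, with diagonal entries $p!\,2^p$.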


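For each $p = 1, 2, \ldots$, let $H^{\otimes p}(X) = H(X) \otimes \cdots \otimes H(X)$ and $H^{\odot p}(X) = H(X) \odot \cdots \odot H(X)$ be, respectively, the $p$-th power tensor and symmetric tensor products of $H(X)$; for $p = 0$, let $H^{\otimes 0}(X) = H^{\odot 0}(X)$ be the space of all constant random variables in $L_2(X)$. $H^{\otimes p}(X)$ is a Hilbert space and its inner product is such that

(4)  $\langle \xi_1 \otimes \cdots \otimes \xi_p,\ \eta_1 \otimes \cdots \otimes \eta_p \rangle = \langle \xi_1, \eta_1 \rangle_{H(X)} \cdots \langle \xi_p, \eta_p \rangle_{H(X)}$

for all $\xi$'s and $\eta$'s in $H(X)$. $H^{\odot p}(X)$ is a closed subspace of $H^{\otimes p}(X)$ spanned by all elements of the form

(5)  $\xi_1 \odot \cdots \odot \xi_p = \frac{1}{p!} \sum_{\pi} \xi_{\pi_1} \otimes \cdots \otimes \xi_{\pi_p}$

where $\pi = (\pi_1, \ldots, \pi_p)$ runs through all permutations of $(1, \ldots, p)$ and the $\xi$'s are elements of $H(X)$. For further properties of tensor and symmetric tensor product spaces see for example [6] and [7].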

Our analyses are based on the following tensor product structure of the nonlinear space of a Gaussian process (see [6] and [7]).

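THEOREM 1. Let $X$ be a zero mean Gaussian process. Then there exists a unique isomorphism $\Phi$ from $\bigoplus_{p \ge 0} H^{\odot p}(X)$ onto $L_2(X)$ such that

(6)  $\Phi(e^{\odot \xi}) = e^{\xi - \frac{1}{2} E\xi^2}$,

where $e^{\odot \xi} = \sum_{p \ge 0} (p!)^{-1/2}\, \xi^{\otimes p}$, $\xi \in H(X)$. If $\xi_1, \ldots, \xi_k \in H(X)$ are orthogonal then

(7)  $\Phi(\xi_1^{\otimes p_1} \odot \cdots \odot \xi_k^{\otimes p_k}) = (p!)^{-1/2} \prod_{i=1}^{k} H_{p_i, E\xi_i^2}(\xi_i)$,

where $p = p_1 + \cdots + p_k$. If $\{\xi_\gamma, \gamma \in \Gamma\}$ ($\Gamma$ linearly ordered) is a complete orthonormal set (CONS) in $H(X)$ then the family

(8)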
2. NONLINEAR ESTIMATION


Let $X = (X_t, t \in T)$ be a second order process with zero mean. Consider the following estimation problem: we observe $X_t$ for $t \in S$, a subset of $T$, and we want to estimate an $L_2$-functional $\theta$ of $X$ based on the observations. We are interested in finding the best estimate $\hat\theta$, an $L_2$-functional of $(X_t, t \in S)$ which minimizes the mean square error of estimation $E(\theta - \hat\theta)^2$. It is well known that $\hat\theta$ can be obtained as the conditional expectation of $\theta$ given $X_t$, $t \in S$; namely

$\hat\theta = E(\theta \mid X_t, t \in S)$.

In general, $\hat\theta$ is extremely difficult to determine. However, if $X$ is a Gaussian process we have a complete solution.

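Let $X$ be a zero mean Gaussian process and $\{\xi_\gamma, \gamma \in \Gamma\}$ ($\Gamma$ linearly ordered) a CONS in $H(X)$. Then, according to Theorem 1, every $L_2$-functional $\theta$ of $X$ has the following orthogonal development

(9)  $\theta = \sum_{p \ge 0} \ \sum_{\substack{p_1 + \cdots + p_k = p \\ \gamma_1 < \cdots < \gamma_k}} a^{\gamma_1 \cdots \gamma_k}_{p_1 \cdots p_k} \prod_{i=1}^{k} H_{p_i}(\xi_{\gamma_i})$.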

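THEOREM 2. Let $X$ be a zero mean Gaussian process and let $\theta \in L_2(X)$ have the orthogonal development (9). Then

$\hat\theta = \sum_{p \ge 0} (p!)^{1/2} \sum_{\substack{p_1 + \cdots + p_k = p \\ \gamma_1 < \cdots < \gamma_k}} a^{\gamma_1 \cdots \gamma_k}_{p_1 \cdots p_k}\, \Phi(\hat\xi_{\gamma_1}^{\otimes p_1} \odot \cdots \odot \hat\xi_{\gamma_k}^{\otimes p_k})$,

where

$\hat\xi_\gamma = E(\xi_\gamma \mid X_t, t \in S) = \mathrm{Proj}_{H(X;S)}\, \xi_\gamma$.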

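PROOF: Upon identifying $L_2(X)$ with $\bigoplus_{p \ge 0} H^{\odot p}(X)$ by virtue of Theorem 1, we have

$\hat\theta = E(\theta \mid X_t, t \in S) = \sum_{p \ge 0} (p!)^{1/2} \sum_{\substack{p_1 + \cdots + p_k = p \\ \gamma_1 < \cdots < \gamma_k}} a^{\gamma_1 \cdots \gamma_k}_{p_1 \cdots p_k}\, \mathrm{Proj}_{L_2(X;S)} (\xi_{\gamma_1}^{\otimes p_1} \odot \cdots \odot \xi_{\gamma_k}^{\otimes p_k})$.

Thus to show the theorem it suffices to show

(10)  $\mathrm{Proj}_{L_2(X;S)}\, \xi_{\gamma_1}^{\otimes p_1} \odot \cdots \odot \xi_{\gamma_k}^{\otimes p_k} = \hat\xi_{\gamma_1}^{\otimes p_1} \odot \cdots \odot \hat\xi_{\gamma_k}^{\otimes p_k}$.

For each $\rho \in L_2(X;S)$ write

$\rho = \sum_{q \ge 0} \ \sum_{\substack{q_1 + \cdots + q_j = q \\ \delta_1 < \cdots < \delta_j}} b^{\delta_1 \cdots \delta_j}_{q_1 \cdots q_j}\, \eta_{\delta_1}^{\otimes q_1} \odot \cdots \odot \eta_{\delta_j}^{\otimes q_j}$

where $\{\eta_\delta, \delta \in \Delta\}$ ($\Delta$ linearly ordered) is a CONS in $H(X;S)$. We have

$\langle \xi_{\gamma_1}^{\otimes p_1} \odot \cdots \odot \xi_{\gamma_k}^{\otimes p_k},\ \rho \rangle$
$= \sum_{q \ge 0} \sum b^{\delta_1 \cdots \delta_j}_{q_1 \cdots q_j} \langle \xi_{\gamma_1}^{\otimes p_1} \odot \cdots \odot \xi_{\gamma_k}^{\otimes p_k},\ \eta_{\delta_1}^{\otimes q_1} \odot \cdots \odot \eta_{\delta_j}^{\otimes q_j} \rangle$
$= \sum_{q \ge 0} \sum b^{\delta_1 \cdots \delta_j}_{q_1 \cdots q_j} \langle \hat\xi_{\gamma_1}^{\otimes p_1} \odot \cdots \odot \hat\xi_{\gamma_k}^{\otimes p_k},\ \eta_{\delta_1}^{\otimes q_1} \odot \cdots \odot \eta_{\delta_j}^{\otimes q_j} \rangle$
$= \langle \hat\xi_{\gamma_1}^{\otimes p_1} \odot \cdots \odot \hat\xi_{\gamma_k}^{\otimes p_k},\ \rho \rangle$,

where the second equality is a consequence of properties (1), (3) and $\langle \xi_\gamma, \eta_\delta \rangle = \langle \hat\xi_\gamma, \eta_\delta \rangle$. Since $\rho \in L_2(X;S)$ is arbitrary and $\hat\xi_{\gamma_1}^{\otimes p_1} \odot \cdots \odot \hat\xi_{\gamma_k}^{\otimes p_k} \in L_2(X;S)$, (10) follows. This completes the proof.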


COROLLARY 3. If $X$ is a zero mean Gaussian process and $\xi \in H(X)$, then

(11)  $E(H_{p,E\xi^2}(\xi) \mid X_t, t \in S) = H_{p,E\hat\xi^2}(\hat\xi)$,

(12)  $E(\exp(\xi - \tfrac{1}{2} E\xi^2) \mid X_t, t \in S) = \exp(\hat\xi - \tfrac{1}{2} E\hat\xi^2)$,

where $\hat\xi = E(\xi \mid X_t, t \in S)$. If $X$ is a zero mean Gaussian martingale then $Y_t = H_{p,EX_t^2}(X_t)$ and $Z_t = \exp(X_t - \tfrac{1}{2} EX_t^2)$ are martingales.

(The last statement is well known for $X$ a Wiener process and $p = 2$.)

PROOF. (11) and (12) follow from properties (2), (3) and Theorem 2. The last assertion is an immediate consequence of (11) and (12).
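Identity (11) can be checked numerically in the simplest setting. The sketch below is an illustration, not part of the original argument. It assumes the standard Gaussian fact that $\xi - \hat\xi$ is independent of the observations with variance $E\xi^2 - E\hat\xi^2$, so that the conditional expectation becomes an integral over that residual. The recurrence used in `hermite`, the function names, and the parameters `t`, `s`, `x_hat` are all assumptions made for the example.

```python
import math

import numpy as np
from numpy.polynomial.hermite_e import hermegauss


def hermite(p, t, x):
    """H_{p,t}(x) via H_{p+1,t} = x * H_{p,t} - p * t * H_{p-1,t} (assumed recurrence)."""
    x = np.asarray(x, dtype=float)
    h_prev, h = np.zeros_like(x), np.ones_like(x)
    for k in range(p):
        h_prev, h = h, x * h - k * t * h_prev
    return h


def conditional_hermite(p, t, s, x_hat, order=60):
    """E[H_{p,t}(xi) | xi_hat = x_hat], where xi = xi_hat + eps and
    eps ~ N(0, t - s) is independent of the observations (0 <= s <= t).

    Here t plays the role of E xi^2 and s the role of E xi_hat^2.
    """
    nodes, weights = hermegauss(order)  # weight function exp(-x^2 / 2)
    eps = math.sqrt(t - s) * nodes
    values = hermite(p, t, x_hat + eps)
    return float(np.sum(weights * values) / math.sqrt(2.0 * math.pi))


def check_identity_11(p, t, s, x_hat):
    """Return both sides of (11) evaluated at xi_hat = x_hat."""
    return conditional_hermite(p, t, s, x_hat), float(hermite(p, s, x_hat))
```

For any degree `p`, the two values returned by `check_identity_11` should agree up to quadrature and rounding error, which is the content of (11) when the observations enter only through $\hat\xi$.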

If $X$ is a zero mean Gaussian process and $T = (-\infty, \infty)$ (or any interval) then by the corollary we have that for all $s \le t$

(13)  $E\{H_{p,EX_t^2}(X_t) \mid X_u, u \le s\} = H_{p,E\hat X_{t,s}^2}(\hat X_{t,s})$,

where $\hat X_{t,s} = E(X_t \mid X_u, u \le s)$.


An expression for $\hat X_{t,s}$ can always be obtained via the Cramér-Hida representation of $X$:

$X_t = \sum_{n=1}^{N} \int_{-\infty}^{t} F_n(t,u)\, dZ^{(n)}_u$.

Then we have

$\hat X_{t,s} = \sum_{n=1}^{N} \int_{-\infty}^{s} F_n(t,u)\, dZ^{(n)}_u$.

The case with $p = 2$, i.e. the $L_2$-functional $X_t^2 - EX_t^2$, is considered in [2] for a very special class of Gaussian processes $X$. It should be clear that whenever a simple expression is available for $\hat X_{t,s}$, then (13) gives a simple expression for the nonlinear predictor of the $L_2$-functional $H_{p,EX_t^2}(X_t)$.

We close this section by solving a simple estimation problem. Let $(X_t, 0 \le t \le T)$ be a stationary reciprocal Gaussian process with $EX_t = 0$, $EX_t^2 = 1$, and continuous covariance function $R(t,s) = R(t-s)$. It is known [5] that $R(t)$ must take one of the following forms: $e^{-a|t|}$, $a > 0$; $\cos at$, $a > 0$ and $T \le \pi/a$; $1 - a|t|$, $0 \le a \le 2/T$. Let $0 < u < t < v < T$ be given. We desire to estimate $\theta$, an $L_2$-functional of $X_t$, based on $X_s$, $s \in S = [0,u] \cup [v,T]$. By reciprocality we have

$\hat X_t = E(X_t \mid X_s, s \in S) = \alpha X_u + \beta X_v$;

and an easy computation shows that

$\alpha = \dfrac{R(u-t) - R(v-t) R(u-v)}{1 - R^2(u-v)}$,  $\beta = \dfrac{R(v-t) - R(u-t) R(u-v)}{1 - R^2(u-v)}$.
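The "easy computation" amounts to solving the two normal equations for the projection of $X_t$ onto the span of $X_u$ and $X_v$. A minimal Python sketch of that computation follows; the helper names (`interpolation_weights`, `estimate_variance`, `exp_cov`) and the sample parameter values are illustrative assumptions, and $R$ is taken to be an even function with $R(0) = 1$.

```python
import math


def interpolation_weights(R, u, t, v):
    """Weights (alpha, beta) with E(X_t | X_u, X_v) = alpha * X_u + beta * X_v
    for a zero mean, unit variance stationary Gaussian process with covariance R."""
    r_ut, r_vt, r_uv = R(u - t), R(v - t), R(u - v)
    denom = 1.0 - r_uv ** 2
    alpha = (r_ut - r_vt * r_uv) / denom
    beta = (r_vt - r_ut * r_uv) / denom
    return alpha, beta


def estimate_variance(R, u, t, v):
    """Variance of alpha * X_u + beta * X_v, i.e. alpha^2 + beta^2 + 2 alpha beta R(u - v)."""
    alpha, beta = interpolation_weights(R, u, t, v)
    return alpha ** 2 + beta ** 2 + 2.0 * alpha * beta * R(u - v)


def exp_cov(a):
    """Covariance R(h) = exp(-a |h|), one of the three admissible forms."""
    return lambda h: math.exp(-a * abs(h))


if __name__ == "__main__":
    R = exp_cov(0.8)          # hypothetical rate a = 0.8
    u, t, v = 1.0, 1.7, 3.0   # hypothetical times with u < t < v
    print(interpolation_weights(R, u, t, v), estimate_variance(R, u, t, v))
```

The variance returned by `estimate_variance` is the parameter of the Hermite polynomials appearing in the best estimate of $\theta$.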


Since $\theta$ is an $L_2$-functional of $X_t$, it has the orthogonal development $\theta = \sum_{p \ge 0} a_p H_{p,EX_t^2}(X_t)$. Thus by Theorem 2 the best estimate of $\theta$ is given by

$\hat\theta = \sum_{p \ge 0} a_p H_{p,E\hat X_t^2}(\hat X_t) = \sum_{p \ge 0} a_p H_{p,\,\alpha^2 + \beta^2 + 2\alpha\beta R(u-v)}(\alpha X_u + \beta X_v)$.


3. NONLINEAR PREDICTION

Consider the following prediction problem for a class of processes: Let $X = (X_t, t \in T)$, $T$ an interval, be a second order process and let $Y_t = \theta_t(X_t)$ with $\theta_t$ a real function such that $EY_t = 0$ and $EY_t^2 < \infty$ for all $t \in T$. Suppose on the basis of the (past) values of $Y = (Y_s, s \le t)$ up to time $t$ we want to find the best prediction of the future value of $Y_{t+\tau}$ for fixed $\tau > 0$.

Two predictors are of special interest: the optimal linear predictor $\hat Y_\ell(t,\tau)$ and the optimal nonlinear predictor $\hat Y_{n\ell}(t,\tau)$. Optimality is in the sense of minimizing the mean square error within the class of all linear and nonlinear predictors respectively. It is well known that

$\hat Y_\ell(t,\tau) = \mathrm{Proj}_{H(Y_s,\, s \le t)}\, Y_{t+\tau}$,

$\hat Y_{n\ell}(t,\tau) = E(Y_{t+\tau} \mid Y_s, s \le t)$.

The corresponding mean square prediction errors are denoted by

$\sigma_\ell^2(t,\tau) = E(Y_{t+\tau} - \hat Y_\ell(t,\tau))^2$,

$\sigma_{n\ell}^2(t,\tau) = E(Y_{t+\tau} - \hat Y_{n\ell}(t,\tau))^2$.

Now introduce a super predictor $\hat Y_s(t,\tau)$ to be the nonlinear prediction of $Y_{t+\tau}$ based on $X_s$, $s \le t$, i.e.

$\hat Y_s(t,\tau) = E(Y_{t+\tau} \mid X_s, s \le t)$;

its mean square prediction error is denoted by $\sigma_s^2(t,\tau)$. It is clear that

(14)  $\sigma_s^2(t,\tau) \le \sigma_{n\ell}^2(t,\tau) \le \sigma_\ell^2(t,\tau)$


and thus $\sigma_s^2$ provides a lower bound for the mean square errors of linear and nonlinear predictors. If $X$ is a Gaussian process, $\sigma_s^2(t,\tau)$ can be obtained by solving an estimation problem as discussed in Section 2. If, in addition, $\theta_t$ happens to be a 1-1 function for each $t$ then the $\sigma$-fields generated by $X_t$ and $Y_t$ coincide. In this case $\hat Y_{n\ell}(t,\tau) = \hat Y_s(t,\tau)$ and the nonlinear predictor can be obtained by solving an estimation problem again.

We now turn to the important special case where $X = (X_t, -\infty < t < \infty)$ is a zero mean stationary Gaussian process with covariance function $R(t,s) = R(t-s)$ and $\theta_t = \theta$ for all $t$. In this case we can calculate $\sigma_s^2(t,\tau) = \sigma_s^2(\tau)$ as follows. Write

(15)  $Y_t = \theta(X_t) = \sum_{p \ge 1} a_p H_{p,\sigma^2}(X_t)$

where $\sigma^2 = EX_t^2$. Clearly $Y$ is a stationary process with $EY_t = 0$ and $EY_t^2 = \sum_{p \ge 1} p!\, a_p^2 \sigma^{2p} < \infty$.
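For a concrete $\theta$ the coefficients $a_p$ in (15) can be computed numerically. The sketch below is illustrative only: it assumes the coefficient formula $a_p = E[\theta(X_t) H_{p,\sigma^2}(X_t)] / (p!\,\sigma^{2p})$, which follows from property (1), evaluates the expectation by Gauss-Hermite quadrature, and uses hypothetical function names.

```python
import math

import numpy as np
from numpy.polynomial.hermite_e import hermegauss


def hermite(p, t, x):
    """H_{p,t}(x) via H_{p+1,t} = x * H_{p,t} - p * t * H_{p-1,t} (assumed recurrence)."""
    x = np.asarray(x, dtype=float)
    h_prev, h = np.zeros_like(x), np.ones_like(x)
    for k in range(p):
        h_prev, h = h, x * h - k * t * h_prev
    return h


def hermite_coefficients(theta, sigma2, max_degree, order=80):
    """Coefficients a_0, ..., a_max_degree of theta(X), X ~ N(0, sigma2), using
    a_p = E[theta(X) H_{p,sigma2}(X)] / (p! sigma2^p)."""
    nodes, weights = hermegauss(order)  # weight function exp(-x^2 / 2)
    x = math.sqrt(sigma2) * nodes
    w = weights / math.sqrt(2.0 * math.pi)
    return [
        float(np.sum(w * theta(x) * hermite(p, sigma2, x)))
        / (math.factorial(p) * sigma2 ** p)
        for p in range(max_degree + 1)
    ]


def second_moment(coeffs, sigma2):
    """E Y_t^2 = sum over p >= 1 of p! a_p^2 sigma2^p."""
    return sum(
        math.factorial(p) * a_p ** 2 * sigma2 ** p
        for p, a_p in enumerate(coeffs)
        if p >= 1
    )
```

For example, with $\theta(x) = x^3$ the only nonzero coefficients are $a_1 = 3\sigma^2$ and $a_3 = 1$, and `second_moment` then returns $15\sigma^6 = EX_t^6$.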

Since for $\xi, \eta \in H(X)$

$E\,H_{p,E\xi^2}(\xi)\,H_{p,E\eta^2}(\eta) = p!\,\langle \xi^{\otimes p}, \eta^{\otimes p} \rangle = p!\,\langle \xi, \eta \rangle^p$,

and if $p \ne q$

$E\,H_{p,E\xi^2}(\xi)\,H_{q,E\eta^2}(\eta) = 0$,

it follows

(16)  $E\,Y_t Y_s = \sum_{p \ge 1} p!\, a_p^2 R^p(t-s)$.

And (16) implies that if $X$ is mean square continuous so is $Y$.

Let $\hat X(t,\tau) = E(X_{t+\tau} \mid X_s, s \le t)$ be the optimal nonlinear predictor of $X_{t+\tau}$ (which is also the optimal linear predictor since $X$ is Gaussian), and $\sigma_0^2(\tau)$ be the mean square prediction error. Then by Corollary 3

(17)  $\hat Y_s(t,\tau) = \sum_{p \ge 1} a_p H_{p,E\hat X^2(t,\tau)}(\hat X(t,\tau))$

and hence

(18)  $\sigma_s^2(\tau) = E(Y_{t+\tau} - \hat Y_s(t,\tau))^2 = EY_{t+\tau}^2 - E\hat Y_s^2(t,\tau) = \sum_{p \ge 1} p!\, a_p^2 \sigma^{2p} - \sum_{p \ge 1} p!\, a_p^2 (\sigma^2 - \sigma_0^2(\tau))^p$.

It is well known from the general theory of stationary processes that $\sigma_0^2(\tau)$ can be obtained analytically (if not explicitly) through the Wiener-Paley factorization theorem if $X$ is regular (i.e. $\bigcap_t H(X_s, s \le t) = \{0\}$). It can be shown that if $X$ is regular so is $Y$, and therefore $\sigma_\ell^2(\tau)$ can also be obtained analytically.

Jaglom [4] has considered the problem of comparing the performance of optimal linear and nonlinear predictors for polynomial functions of certain stationary Markov processes. Donelson and Maltz [1] studied this problem in detail for polynomial functions of the Ornstein-Uhlenbeck process. The inequality (14) plays a central role in such studies.

As an example consider $X$ the Ornstein-Uhlenbeck process and $Y$ a nonlinear function of $X$ given by (15). Recall that the Ornstein-Uhlenbeck process is a Gaussian process with zero mean and covariance $R(t-s) = e^{-|t-s|}$. By the Markov property we have

$\hat X(t,\tau) = E(X_{t+\tau} \mid X_s, s \le t) = e^{-\tau} X_t$.

Thus it follows from (17) and (18) that

$\hat Y_s(t,\tau) = \sum_{p \ge 1} a_p H_{p,e^{-2\tau}}(e^{-\tau} X_t) = \sum_{p \ge 1} a_p e^{-p\tau} H_p(X_t)$,

$\sigma_s^2(\tau) = \sum_{p \ge 1} p!\, a_p^2 (1 - e^{-2p\tau})$.

This result, with $Y$ a polynomial function of $X$, has been obtained by Donelson and Maltz using a different approach.
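The closed form for the super predictor error in the Ornstein-Uhlenbeck example is easy to evaluate. The Python sketch below computes $\sigma_s^2(\tau) = \sum_{p \ge 1} p!\, a_p^2 (1 - e^{-2p\tau})$ and, for comparison, the error of a deliberately simple linear predictor $c\,Y_t$ that uses only the current value. That one-point predictor is an assumption introduced here for illustration and is not the optimal linear predictor $\hat Y_\ell$; its error is an upper bound for $\sigma_\ell^2(\tau)$ and, by (14), for $\sigma_s^2(\tau)$. Function names are hypothetical.

```python
import math


def super_predictor_mse(coeffs, tau):
    """sigma_s^2(tau) for Y_t = sum_p a_p H_p(X_t), X the Ornstein-Uhlenbeck
    process with covariance exp(-|h|); coeffs[p] = a_p and coeffs[0] is ignored."""
    return sum(
        math.factorial(p) * a_p ** 2 * (1.0 - math.exp(-2.0 * p * tau))
        for p, a_p in enumerate(coeffs)
        if p >= 1
    )


def covariance_of_y(coeffs, lag):
    """E Y_t Y_s = sum_p p! a_p^2 R^p(t - s) with R(h) = exp(-|h|), as in (16)."""
    return sum(
        math.factorial(p) * a_p ** 2 * math.exp(-p * abs(lag))
        for p, a_p in enumerate(coeffs)
        if p >= 1
    )


def one_point_linear_mse(coeffs, tau):
    """Error of the best predictor of the form c * Y_t (illustrative only,
    generally worse than the optimal linear predictor based on the whole past)."""
    r0 = covariance_of_y(coeffs, 0.0)
    r_tau = covariance_of_y(coeffs, tau)
    return r0 - r_tau ** 2 / r0


if __name__ == "__main__":
    cubic = [0.0, 3.0, 0.0, 1.0]  # Y_t = X_t^3 = H_3(X_t) + 3 H_1(X_t) when E X_t^2 = 1
    for tau in (0.1, 0.5, 2.0):
        print(tau, super_predictor_mse(cubic, tau), one_point_linear_mse(cubic, tau))
```

When only one coefficient is nonzero the two errors coincide, which matches the remark that follows for $Y_t = H_p(X_t)$.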

Finally, we remark that if $Y_t = H_p(X_t)$ then

$\hat Y_{n\ell}(t,\tau) = \hat Y_s(t,\tau) = e^{-p\tau} Y_t$,

$\sigma_{n\ell}^2(\tau) = \sigma_s^2(\tau) = p!\,(1 - e^{-2p\tau})$.
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